26/06/2018

Welcome, and overview

First session:

  • Not heaps of paleo-specific R
  • But building blocks to make you an expeRt
  • Things that go into R (data inputs)
  • How to structure your data inputs and outputs
  • Getting started in R

Welcome, and overview

Second session:

  • Data validation
  • Data visualisation
  • NMDS
  • RDA
  • Plotting NMDS etc for publications
  • Saving and export

BD (before data): project structures

  1. Raw data (as entered)
  2. Corrected and modified data

BD (before data): project structures

Keep a record of how you went from (1) to (2) - even if you don't do it in R

  1. Correct/modify data in R (with reminders)
  2. Create a new spreadsheet and keep a .txt record of the changes

Project structures

Where should a project live?

Pros and cons of the following:

  • MW-LCR shared drives
  • Dropbox (C:/)
  • Github (C:/)
  • MW-LCR personal drive

Where should a project live?

githubScreenshot


Within the project: naming files

Machine readable

  • no punctuation symbols
  • no spaces
  • be careful with capitals
  • for data, easy to parse

Machine readable

  • e.g. date_site_coreNUM_type
  • e.g. 2018-06-30_eweburn_X18-062_concentrations.csv
  • e.g. 2018-06-30_eweburn_X18-062_age-depth.csv
  • e.g. 2018-06-30_eweburn_X18-062_species-dictionary.csv

Note that we separate units of metadata with a "_", and words within a unit with a "-".
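
Why "easy to parse" matters: a file name built this way can be split back into its metadata with one line of base R. The sketch below is only an illustration; parse_filename() is a made-up helper, not part of the workshop material, and it assumes every file name has exactly four units.

```r
# Hypothetical helper: split a file name into its metadata units
parse_filename <- function(filename) {
  stem  <- tools::file_path_sans_ext(basename(filename))  # drop the folder and the ".csv"
  units <- strsplit(stem, "_", fixed = TRUE)[[1]]         # "_" separates the units
  names(units) <- c("date", "site", "core", "type")       # assumes exactly four units
  units
}

parse_filename("2018-06-30_eweburn_X18-062_age-depth.csv")
```

Because "-" is only used within a unit, the core number (X18-062) and the type (age-depth) come back in one piece.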

Human readable

  • This applies to scripts and data

  • e.g. 1_data-cleaning-vegetation.R
  • e.g. 2_data-cleaning-species-dictionary.R
  • e.g. function_plot-all-species.R
  • e.g. function_clean-italics-tilia.R

Group discussion - data & spreadsheets

example data screenshot


Booting up the R

rstudio


The basics (1)

getwd()
## [1] "/Users/oliviaburge/Documents/paleo-R-workshop/1-folders-spreadsheets-organisingData"

The basics (2)

setwd("path/to/your/project")   # placeholder path: replace with your own project folder

The basics (3)

sessionInfo()
## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.5
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] DiagrammeR_1.0.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.17       highr_0.6          pillar_1.2.1      
##  [4] compiler_3.5.0     RColorBrewer_1.1-2 influenceR_0.1.0  
##  [7] plyr_1.8.4         bindr_0.1.1        viridis_0.5.1     
## [10] tools_3.5.0        digest_0.6.15      jsonlite_1.5      
## [13] viridisLite_0.3.0  gtable_0.2.0       evaluate_0.10.1   
## [16] tibble_1.4.2       rgexf_0.15.3       pkgconfig_2.0.1   
## [19] rlang_0.2.1        igraph_1.2.1       rstudioapi_0.7    
## [22] yaml_2.1.19        bindrcpp_0.2.2     gridExtra_2.3     
## [25] downloader_0.4     dplyr_0.7.5        stringr_1.3.1     
## [28] knitr_1.20         htmlwidgets_1.2    hms_0.4.2         
## [31] grid_3.5.0         rprojroot_1.3-2    tidyselect_0.2.4  
## [34] glue_1.2.0.9000    R6_2.2.2           Rook_1.1-1        
## [37] XML_3.98-1.11      rmarkdown_1.10     ggplot2_2.2.1.9000
## [40] tidyr_0.8.0        purrr_0.2.5        readr_1.1.1       
## [43] magrittr_1.5       backports_1.1.2    scales_0.5.0      
## [46] htmltools_0.3.6    assertthat_0.2.0   colorspace_1.3-2  
## [49] brew_1.0-6         stringi_1.2.3      visNetwork_2.0.3  
## [52] lazyeval_0.2.1     munsell_0.4.3

The basics (4)

require(tidyverse)
## Loading required package: tidyverse
## ── Attaching packages ────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 2.2.1.9000     ✔ purrr   0.2.5     
## ✔ tibble  1.4.2          ✔ dplyr   0.7.5     
## ✔ tidyr   0.8.0          ✔ stringr 1.3.1     
## ✔ readr   1.1.1          ✔ forcats 0.3.0
## ── Conflicts ───────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

Example data

# install.packages("vegan")
# install.packages("skimr")
require(vegan)
require(skimr)
data("mite")             # the data command only works for in-built data
data("mite.env")         # we'll cover reading in your own data later on

Viewing data

head(mite, n = 6)
Brachy PHTH HPAV RARD SSTR Protopl MEGR MPRO TVIE HMIN HMIN2 NPRA TVEL ONOV SUCT LCIL Oribatl1 Ceratoz1 PWIL Galumna1 Stgncrs2 HRUF Trhypch1 PPEL NCOR SLAT FSET Lepidzts Eupelops Miniglmn LRUG PLAG2 Ceratoz3 Oppiminu Trimalc2
17 5 5 3 2 1 4 2 2 1 4 1 17 4 9 50 3 1 1 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 7 16 0 6 0 4 2 0 0 1 3 21 27 12 138 6 0 1 3 9 1 1 1 2 2 2 1 0 0 0 0 0 0 0
4 3 1 1 2 0 3 0 0 0 6 3 20 17 10 89 3 0 2 1 8 0 3 0 2 0 8 0 0 0 0 0 0 0 0
23 7 10 2 2 0 4 0 1 2 10 0 18 47 17 108 10 1 0 1 2 1 2 1 3 2 12 0 0 0 0 0 0 0 0
5 8 13 9 0 13 0 0 0 3 14 3 32 43 27 5 1 0 5 2 1 0 1 0 0 0 12 2 0 0 0 0 0 0 0
19 7 5 9 3 2 3 0 0 20 16 2 13 38 39 3 5 0 1 1 8 0 4 0 1 0 10 0 0 0 0 0 0 0 0

The basics (5)

  • When you just want one column, use the format dataframeNAME$columnNAME:
mite$Brachy
##  [1] 17  2  4 23  5 19 17  5  3 22 36 28  3 41  6  7  9 19 12  3  5  4 19
## [24]  4 12  6  4  9 42 20 12  4 38  5  3  3  3  8  0  1  2  0  5  0 11  4
## [47]  0  0 10  4  2  3  3  2  1  1  0  0  1  1  6  3 19  3  4  8  4  6 20
## [70]  5

Viewing data

skim(mite)
## Skim summary statistics
##  n obs: 70 
##  n variables: 35 
## 
## ── Variable type:integer ──────────────────────────────────────────────────────────────────
##  variable missing complete  n  mean    sd p0  p25  p50   p75 p100     hist
##    Brachy       0       70 70  8.73 10.08  0 3     4.5 11.75   42 ▇▂▁▂▁▁▁▁
##  Ceratoz1       0       70 70  1.29  1.46  0 0     1    2       5 ▇▆▁▃▁▁▁▁
##  Ceratoz3       0       70 70  1.3   2.2   0 0     0    2       9 ▇▁▁▁▁▁▁▁
##  Eupelops       0       70 70  0.64  0.99  0 0     0    1       4 ▇▃▁▁▁▁▁▁
##      FSET       0       70 70  1.86  3.18  0 0     0    2      12 ▇▂▁▁▁▁▁▁
##  Galumna1       0       70 70  0.96  1.73  0 0     0    1       8 ▇▁▁▁▁▁▁▁
##      HMIN       0       70 70  4.91  8.47  0 0     0    4.75   36 ▇▁▁▁▁▁▁▁
##     HMIN2       0       70 70  1.96  3.92  0 0     0    2.75   20 ▇▂▁▁▁▁▁▁
##      HPAV       0       70 70  8.51  7.56  0 4     6.5 12      37 ▇▇▃▃▁▁▁▁
##      HRUF       0       70 70  0.23  0.62  0 0     0    0       3 ▇▁▁▁▁▁▁▁
##      LCIL       0       70 70 35.26 88.85  0 1.25 13   44     723 ▇▁▁▁▁▁▁▁
##  Lepidzts       0       70 70  0.17  0.54  0 0     0    0       3 ▇▁▁▁▁▁▁▁
##      LRUG       0       70 70 10.43 12.66  0 0     4.5 17.75   57 ▇▂▂▁▁▁▁▁
##      MEGR       0       70 70  2.19  3.62  0 0     1    3      17 ▇▂▁▁▁▁▁▁
##  Miniglmn       0       70 70  0.24  0.79  0 0     0    0       5 ▇▁▁▁▁▁▁▁
##      MPRO       0       70 70  0.16  0.47  0 0     0    0       2 ▇▁▁▁▁▁▁▁
##      NCOR       0       70 70  1.13  1.65  0 0     0.5  1.75    7 ▇▃▂▂▁▁▁▁
##      NPRA       0       70 70  1.89  2.37  0 0     1    2.75   10 ▇▂▂▁▁▁▁▁
##      ONOV       0       70 70 17.27 18.05  0 5    10.5 24.25   73 ▇▃▂▁▁▁▁▁
##  Oppiminu       0       70 70  1.11  1.84  0 0     0    1.75    9 ▇▁▁▁▁▁▁▁
##  Oribatl1       0       70 70  1.89  3.43  0 0     0    2.75   17 ▇▁▁▁▁▁▁▁
##      PHTH       0       70 70  1.27  2.17  0 0     0    2       8 ▇▁▁▁▁▁▁▁
##     PLAG2       0       70 70  0.8   1.79  0 0     0    1       9 ▇▁▁▁▁▁▁▁
##      PPEL       0       70 70  0.17  0.54  0 0     0    0       3 ▇▁▁▁▁▁▁▁
##   Protopl       0       70 70  0.37  1.61  0 0     0    0      13 ▇▁▁▁▁▁▁▁
##      PWIL       0       70 70  1.09  1.71  0 0     0    1       8 ▇▁▁▁▁▁▁▁
##      RARD       0       70 70  1.21  2.78  0 0     0    1      13 ▇▂▁▁▁▁▁▁
##      SLAT       0       70 70  0.4   1.23  0 0     0    0       8 ▇▁▁▁▁▁▁▁
##      SSTR       0       70 70  0.31  0.97  0 0     0    0       6 ▇▁▁▁▁▁▁▁
##  Stgncrs2       0       70 70  0.73  1.83  0 0     0    0       9 ▇▁▁▁▁▁▁▁
##      SUCT       0       70 70 16.96 13.89  0 7.25 13.5 24      63 ▇▇▆▅▂▁▁▁
##  Trhypch1       0       70 70  2.61  6.14  0 0     0    2      29 ▇▁▁▁▁▁▁▁
##  Trimalc2       0       70 70  2.07  5.79  0 0     0    0      33 ▇▁▁▁▁▁▁▁
##      TVEL       0       70 70  9.06 10.93  0 0     3   19      42 ▇▁▁▂▁▁▁▁
##      TVIE       0       70 70  0.83  1.47  0 0     0    1       7 ▇▁▁▁▁▁▁▁

Live coding demo

  • Select some columns
  • Filter just some observations
  • Histogram of one column

Select - concept

Select chooses certain columns, either to keep or to drop. The format is:

DATANAME %>% select(col1, col2, col3)

Select - examples

mite %>% 
  select(Brachy, PHTH, HPAV)
##    Brachy PHTH HPAV
## 1      17    5    5
## 2       2    7   16
## 3       4    3    1
## 4      23    7   10
## 5       5    8   13
## 6      19    7    5
## 7      17    3    8
## 8       5    4    8
## 9       3    3    2
## 10     22    4    5
## 11     36    7   35
## 12     28    2   12
## 13      3    2    4
## 14     41    5   12
## 15      6    0    6
## 16      7    2    3
## 17      9    0    1
## 18     19    3    7
## 19     12    2   10
## 20      3    1    7
## 21      5    2    8
## 22      4    0    4
## 23     19    0    8
## 24      4    0    1
## 25     12    4   15
## 26      6    0    4
## 27      4    4    4
## 28      9    0    4
## 29     42    0    6
## 30     20    1    2
## 31     12    0    5
## 32      4    0    9
## 33     38    0   17
## 34      5    0   14
## 35      3    0    0
## 36      3    1    2
## 37      3    0    5
## 38      8    0    6
## 39      0    0    0
## 40      1    0   31
## 41      2    0   10
## 42      0    0   12
## 43      5    0    2
## 44      0    0    2
## 45     11    0    8
## 46      4    0    4
## 47      0    0    8
## 48      0    0    3
## 49     10    0   14
## 50      4    0   37
## 51      2    0    5
## 52      3    0    4
## 53      3    0   17
## 54      2    0    7
## 55      1    0    3
## 56      1    0   16
## 57      0    0    0
## 58      0    0   12
## 59      1    0    0
## 60      1    0   16
## 61      6    0    9
## 62      3    0    5
## 63     19    0    3
## 64      3    0   16
## 65      4    0   10
## 66      8    0   18
## 67      4    0    3
## 68      6    0   22
## 69     20    2    4
## 70      5    0   11

Select - examples

names(mite)
##  [1] "Brachy"   "PHTH"     "HPAV"     "RARD"     "SSTR"     "Protopl" 
##  [7] "MEGR"     "MPRO"     "TVIE"     "HMIN"     "HMIN2"    "NPRA"    
## [13] "TVEL"     "ONOV"     "SUCT"     "LCIL"     "Oribatl1" "Ceratoz1"
## [19] "PWIL"     "Galumna1" "Stgncrs2" "HRUF"     "Trhypch1" "PPEL"    
## [25] "NCOR"     "SLAT"     "FSET"     "Lepidzts" "Eupelops" "Miniglmn"
## [31] "LRUG"     "PLAG2"    "Ceratoz3" "Oppiminu" "Trimalc2"
mite %>% 
  select(-c(PHTH:Oppiminu))
##    Brachy Trimalc2
## 1      17        0
## 2       2        0
## 3       4        0
## 4      23        0
## 5       5        0
## 6      19        0
## 7      17        0
## 8       5        0
## 9       3        0
## 10     22        0
## 11     36        0
## 12     28        0
## 13      3        0
## 14     41        0
## 15      6        0
## 16      7        0
## 17      9        0
## 18     19        0
## 19     12        0
## 20      3        0
## 21      5        0
## 22      4        0
## 23     19        0
## 24      4        0
## 25     12        0
## 26      6        0
## 27      4        0
## 28      9        0
## 29     42        0
## 30     20        0
## 31     12        0
## 32      4        0
## 33     38        0
## 34      5        0
## 35      3        0
## 36      3        0
## 37      3        0
## 38      8        0
## 39      0        0
## 40      1        0
## 41      2        0
## 42      0        0
## 43      5        1
## 44      0        0
## 45     11        0
## 46      4        0
## 47      0        0
## 48      0        1
## 49     10        0
## 50      4        0
## 51      2        1
## 52      3        0
## 53      3        9
## 54      2        1
## 55      1        0
## 56      1        5
## 57      0        0
## 58      0        0
## 59      1        1
## 60      1        1
## 61      6        5
## 62      3        0
## 63     19        8
## 64      3       11
## 65      4       25
## 66      8        9
## 67      4       33
## 68      6       17
## 69     20        3
## 70      5       14

Select - discussion

  • What is the difference between $ and select?
mite$Brachy               # 'base' R
mite %>% select(Brachy)   # tidyverse
mite %>% pull(Brachy)     # we haven't discussed this one yet
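
One way to see the difference is to check what each version gives back. This is a small sketch rather than part of the original demo; miteSmall is a made-up name for the first three rows of two mite columns, typed in by hand so the example runs without vegan.

```r
library(dplyr)

# first three rows of two mite columns, copied from the output shown earlier
miteSmall <- data.frame(Brachy = c(17, 2, 4), PHTH = c(5, 7, 3))

asVector    <- miteSmall$Brachy               # 'base' R: a plain vector
asDataFrame <- miteSmall %>% select(Brachy)   # tidyverse: still a data frame, with one column
asPulled    <- miteSmall %>% pull(Brachy)     # tidyverse: a plain vector again

class(asVector)
class(asDataFrame)
identical(asVector, asPulled)
```

So select() is the one to use when the next step expects a data frame, and $ or pull() when it expects a vector (for example mean()).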

What is this? %>% %>% %>%


Chaining with the pipe (%>%, read as "and then") lets us write code in the order we want it done. Without it, each step must be wrapped in brackets around the previous one, so the first thing to be done ends up right in the middle.

mite.env %>% select(Shrub, Topo)

means, take the mite.env dataframe, and then select the columns Shrub and Topo.
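
To make the comparison concrete, here is the same two-step task written both ways. It is a sketch only: miteEnvSmall is a made-up name for three rows copied from mite.env, so the example runs on its own.

```r
library(dplyr)

# three rows copied from mite.env (a hand-typed stand-in for the real data)
miteEnvSmall <- data.frame(
  WatrCont = c(691.79, 708.16, 656.35),
  Shrub    = c("Few", "Few", "None"),
  Topo     = c("Blanket", "Blanket", "Blanket")
)

# nested: read from the inside out (select first, then head)
nested <- head(select(miteEnvSmall, Shrub, Topo), 2)

# chained: read from left to right (take the data, and then select, and then head)
chained <- miteEnvSmall %>%
  select(Shrub, Topo) %>%
  head(2)

identical(nested, chained)
```

Both versions do exactly the same work; the chained one is just written in the order it happens.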

Filter - concept

Filter selects rows in your dataframe, based on the conditions you specify. Like select(), it can be used with or without the pipe: DATA %>% filter(CONDITION1) does the same thing as the format below.

filter(DATA, CONDITION1)

  • row only selected if condition satisfied

Filter - concept

filter(DATA, CONDITION1 & CONDITION2)

  • both conditions must be satisfied

filter(DATA, CONDITION1 | CONDITION2)

  • one or both conditions must be satisfied
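
The difference between & and | is easiest to see by counting rows. The sketch below uses miteEnvSmall, a made-up name for five rows copied by hand from mite.env, so the counts apply to this small example only.

```r
library(dplyr)

# five rows copied from mite.env (a hand-typed stand-in for the real data)
miteEnvSmall <- data.frame(
  SubsDens = c(64.75, 62.38, 52.73, 54.99, 80.59),
  WatrCont = c(691.79, 708.16, 656.35, 434.81, 266.78),
  Shrub    = c("Few", "Few", "None", "Few", "Many")
)

# & : a row is kept only if it is wet AND has few shrubs
both <- miteEnvSmall %>% filter(WatrCont > 650 & Shrub == "Few")

# | : a row is kept if it is wet OR has few shrubs (or both)
either <- miteEnvSmall %>% filter(WatrCont > 650 | Shrub == "Few")

nrow(both)
nrow(either)
```

Using | can never return fewer rows than using & with the same conditions.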

Filter - example

mite.env %>% filter(WatrCont > 650)
##   SubsDens WatrCont Substrate Shrub    Topo
## 1    64.75   691.79   Sphagn2   Few Blanket
## 2    62.38   708.16  Barepeat   Few Blanket
## 3    52.73   656.35   Sphagn1  None Blanket
## 4    52.12   826.96   Sphagn1  None Blanket

Filter - example with two conditions

Here, Shrub has to equal "Few". If you want to select two values (such as two sites) see the next slide.

mite.env %>% filter(WatrCont > 650 & Shrub == "Few")
##   SubsDens WatrCont Substrate Shrub    Topo
## 1    64.75   691.79   Sphagn2   Few Blanket
## 2    62.38   708.16  Barepeat   Few Blanket

Filter - example selecting > 1 element

unique(mite.env$Substrate)
## [1] Sphagn1   Litter    Interface Sphagn3   Sphagn4   Sphagn2   Barepeat 
## Levels: Sphagn1 Sphagn2 Sphagn3 Sphagn4 Litter Barepeat Interface
mite.env %>% filter(Substrate %in% c("Litter", "Barepeat", "Interface"))
##    SubsDens WatrCont Substrate Shrub    Topo
## 1     54.99   434.81    Litter   Few Hummock
## 2     46.07   371.72 Interface   Few Hummock
## 3     80.59   266.78 Interface  Many Blanket
## 4     61.43   310.70    Litter  Many Blanket
## 5     37.25   239.51 Interface  Many Blanket
## 6     59.93   350.64 Interface  Many Blanket
## 7     35.41   321.87 Interface   Few Hummock
## 8     29.56   296.95 Interface  Many Hummock
## 9     44.10   383.83 Interface  Many Blanket
## 10    38.61   145.68 Interface  Many Hummock
## 11    32.27   291.59 Interface  Many Hummock
## 12    35.30   293.49 Interface  Many Blanket
## 13    32.86   323.12 Interface  Many Hummock
## 14    37.33   284.27 Interface  Many Blanket
## 15    53.17   367.11 Interface  Many Blanket
## 16    34.76   393.62 Interface   Few Blanket
## 17    47.74   528.44 Interface   Few Blanket
## 18    34.26   398.20 Interface   Few Blanket
## 19    26.60   386.37 Interface   Few Blanket
## 20    56.65   581.00 Interface   Few Blanket
## 21    62.38   708.16  Barepeat   Few Blanket
## 22    46.81   538.51 Interface   Few Blanket
## 23    33.98   323.96 Interface   Few Blanket
## 24    28.29   434.28 Interface  None Blanket
## 25    26.83   414.65 Interface  None Blanket
## 26    31.98   447.65 Interface  None Blanket
## 27    41.38   532.88 Interface  None Blanket
## 28    56.82   613.39  Barepeat  None Blanket
## 29    47.03   626.36 Interface  None Blanket
## 30    48.59   634.75 Interface  None Blanket
## 31    35.03   482.27 Interface  None Blanket

Histogram: concept

  • We use the function ggplot(), which comes from the ggplot2 package.
  • ggplot() just initiates the plot.
  • Then we tell it to draw a histogram with the geom_histogram() part.

ggplot(data = DATAFRAME, aes(x = COLUMN_FOR_HISTOGRAM)) + geom_histogram()

First example plot: histogram

ggplot(data = mite.env, aes(x = SubsDens)) +
  geom_histogram() 
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

First example plot: histogram

ggplot(data = mite.env, aes(x = SubsDens)) +
  geom_histogram(binwidth = 10) 

Reading in real data!

Actually, it should be real, but it should also be tidy

Reading in real data!

  • Where is the file?
  • How do we tell R to get there?
    • This also depends on where R has the working directory
  • We need to tell R how to get from the working directory to the file that we want to read in!

getwd()

setwd("path/to/your/project")   # placeholder path: replace with your own project folder
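
A small sketch of how the pieces fit together, assuming the file sits in a "data" folder inside the project (the layout used for the example files in this workshop). file.path() and file.exists() are base R; relativePath is just a name chosen for this illustration.

```r
getwd()   # where R is currently looking

# the path from the working directory to the file, built piece by piece
relativePath <- file.path("data", "messyDataExample.csv")
relativePath

# TRUE only if the working directory is the project folder and the file is really there
file.exists(relativePath)
```

If file.exists() returns FALSE, either the working directory or the file name is not what you think it is, and reading the file in will fail for the same reason.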

What happens with untidy data?

aMess <- read.csv("data/messyDataExample.csv")

head(aMess)
##   Ashburton.Lakes.weight.of.vegetation.harvest.subsample           X X.1
## 1                                                                       
## 2                                                  Date:                
## 3                                              Lab team:                
## 4                                                                       
## 5                                                                       
## 6                                                        Wet weight     
##            X.2          X.3          X.4                    X.5 X.6
## 1                                                                  
## 2                                                                  
## 3                                                                  
## 4                                                                  
## 5                                                                  
## 6 Wet weight 1 Wet weight 2 Wet weight 3 Average wet weight (g)    
##           X.7          X.8          X.9         X.10
## 1                                                   
## 2                                                   
## 3                                                   
## 4                                                   
## 5                                                   
## 6 Dry weight  Dry weight 1 Dry weight 2 Dry weight 3
##                     X.11 X.12 X.13 X.14 X.15 X.16 X.17 X.18 X.19 X.20 X.21
## 1                                    NA                                   
## 2                                    NA                                   
## 3                                    NA                                   
## 4                                    NA                                   
## 5                                    NA                                   
## 6 Average dry weight (g)             NA                                   
##   X.22 X.23 X.24 X.25 X.26 X.27 X.28 X.29 X.30 X.31 X.32
## 1             NA                                        
## 2             NA                                        
## 3             NA                                        
## 4             NA                                        
## 5             NA                                        
## 6             NA

What happens with untidy data?

  • To see the whole thing: View(aMess)

  • Compare the output of names(aMess) and names(mite.env)
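
The same problem can be reproduced in miniature. The text below is invented for illustration (it is not the Ashburton Lakes file), and skip = 3 is only right for this toy layout. Tidying the spreadsheet itself is still the approach taken in this workshop; the sketch just shows why title rows above the real header cause trouble.

```r
# a toy "messy" file: three title rows sit above the real header
messyText <- "Vegetation harvest subsample,,
Date:,,
Lab team:,,
plot,wetWeight,dryWeight
A,10.2,4.1
B,12.5,5.0"

asRead  <- read.csv(text = messyText)             # the title row is taken as the header
skipped <- read.csv(text = messyText, skip = 3)   # start reading at the real header row

names(asRead)
names(skipped)
```

Only the second version has column names that you could use with select() or filter().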

Let's fix it, and read it back in

[group task]

Read in tidied data

  • Where is it?
  • Where is the working directory?
  • Where is the file in relation to our working directory?
  • What did we call it (i.e. the filename)?!

  • Then we can read it back in

Read in Awarua data

  • Where is it?
  • Where is the working directory?
  • Where is the file in relation to our working directory?
  • What is the filename?

Awarua data

awarua <- read.csv("data/awaruaForestExample.csv")
head(awarua)
##   Depth..cm. DACCUP DACDAC FUSTYP PRUTAX LOPMEN PODOCA PHYLLO HALOCA
## 1          1     66     12      2      4      0      5      2      0
## 2          3     47      6      3     12      4      9     13      0
## 3          5     70      3      1      8      2     11     18      0
## 4          7    100      9      5     18      0     10     18      0
## 5          9     95     11      8      8      0      8     29      0
## 6         11     87     16      4     18      1      5     27      0
##   ELAEOC HOHERI WEINMA ASTERA COPROS MYRSIN DRATYP GAULTH RUBUS MUEHLE
## 1      6      0     56      0      4      9      0      0     0      0
## 2     24      1     66      3      7     16      1      1     1      1
## 3     13      0     35      0      4     22      0      0     0      0
## 4     14      0     37      3     11     23      0      2     0      0
## 5      9      0     50      2      8     17      1      0     1      0
## 6     12      0     17      0     18     21      0      0     0      2
##   PENNAN PSEUDO ARALIA POACEA LEPTYP GONTYP EMPTYP GLEICH MYRTYP PHORMI
## 1      0      0      1     74     12      0      0      1      0      0
## 2      0      4      5     27     25      0      0      0     10      2
## 3      3      1      0     13     47      0      0      1     13      0
## 4      0      1      0     14     55      0      1      0     44      1
## 5      0      0      4      9     82      0      1      4     30      1
## 6      1      1      1      4     82      0      0      3     26      0
##   RANTYP SPHAGN GUNNER NERTER CAREX BAUTYP SCHTYP PTEESC CYDTYP DICSQU
## 1      0      0      0      0     0      1      0      3      2      0
## 2      2      0      0      0     0      4      0      6      2      0
## 3      0      0      0      0     0      1      0      4      1      0
## 4      0      0      0      0     0      2      0      8      2      1
## 5      0      0      0      0     2      2      0      3      0      0
## 6      1      0      0      0     2      1      0      1      0      0
##   MONOL HISINC MICPUS LYCLAT LYCFAS PINACE KIRTYP RUMEX PCHARS PCHARL
## 1     3      2      2      0      0      8      0     0      4      0
## 2     2      0      2      0      0      0      1     0      5      0
## 3     3      1      0      1      0      1      3     0     22      0
## 4     7      0      2      0      0      0      1     0     10      0
## 5     5      0      0      0      0      0      0     1     26      0
## 6     5      1      3      0      0      0      0     3     45      0
##   BRYOPH NEOPTP PITTOS
## 1      0      0      0
## 2      0      0      0
## 3      1      0      0
## 4      0      0      0
## 5      0      0      0
## 6      0      0      0

Awarua - exercise

  • create a subset of the data.frame called "awaruaCharcoal"
  • we only want depths up to and including 20 cm
  • we only want the Depth column, and the two charcoal columns
  • bonus: rename the depth column from Depth..cm. to depth_CM
awaruaCharcoal <- awarua %>%
  [...]

Awarua - exercise answer

awaruaCharcoal <- awarua %>%
  filter(Depth..cm. <= 20) %>%
  select(depth_CM = Depth..cm., PCHARS, PCHARL)

head(awaruaCharcoal, 2)
##   depth_CM PCHARS PCHARL
## 1        1      4      0
## 2        3      5      0

Awarua - plot example

charcoalPlot <- ggplot(data = awaruaCharcoal, aes(x = depth_CM, y = PCHARS)) +
  geom_line() +    
  theme_classic() +
  labs(x = "Depth (cm)", y = "Charcoal", title = "Charcoal within Awarua wetland",
       subtitle = "Forest site")
charcoalPlot

Awarua - plot exercise - modifying plots

  • Can you change the title of the y-axis to something else?
  • Can you change the colour of the line?
  • [extension] try googling to see if you can find out how to reverse the direction of the x-axis

Awarua - plot exercise - creating a plot

  • Take the original awarua data (awarua), make a plot of depth vs DACCUP
  • Style it as you want
  • How would we add extra lines for DACDAC & MYRSIN?

Awarua - plot creation answer

  • NB: there are easier ways of doing this, but we would have to reshape the data first
awaruaPlot <- ggplot(awarua, aes(x = Depth..cm., y = DACCUP)) +
  geom_line() + 
  geom_line(aes(y = DACDAC), linetype = "dashed") + 
  geom_line(aes(y = MYRSIN), linetype = "dotted") +
  theme_classic() +
  scale_x_reverse() +
  coord_flip() +  # switches the x and y axes
  labs(y = "Count", x = "Depth (cm)")
awaruaPlot
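
For the curious, here is a sketch of the reshaping route. It assumes a newer tidyr than the 0.8.0 shown earlier (pivot_longer() arrived in tidyr 1.0.0), and awaruaSmall is a made-up name for the first six rows of three taxa, typed in from the head(awarua) output so the example runs on its own.

```r
library(dplyr)
library(tidyr)     # pivot_longer() needs tidyr >= 1.0.0
library(ggplot2)

# first six rows of three taxa, copied from head(awarua)
awaruaSmall <- data.frame(
  Depth..cm. = c(1, 3, 5, 7, 9, 11),
  DACCUP     = c(66, 47, 70, 100, 95, 87),
  DACDAC     = c(12, 6, 3, 9, 11, 16),
  MYRSIN     = c(9, 16, 22, 23, 17, 21)
)

# reshape: one row per depth and taxon, instead of one column per taxon
awaruaLong <- awaruaSmall %>%
  pivot_longer(cols = c(DACCUP, DACDAC, MYRSIN),
               names_to = "taxon", values_to = "count")

# one geom_line() now draws every taxon, and we get a legend for free
longPlot <- ggplot(awaruaLong, aes(x = Depth..cm., y = count, linetype = taxon)) +
  geom_line() +
  theme_classic() +
  scale_x_reverse() +
  coord_flip() +   # switches the x and y axes
  labs(y = "Count", x = "Depth (cm)")
longPlot
```

Adding a fourth taxon then only means adding its name to cols, rather than writing another geom_line().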

Mutate - concept

  • Sometimes we want to change values in a column, for example by taking the log or square root.
  • Or we may want to calculate the percentage instead of counts
  • When we change values, we use the function mutate.
  • We give the name of the new column we want, and what we want to do to the old column

dataFRAME %>% mutate(logDACDAC = log(DACDAC))
dataFRAME %>% mutate(sqrtCHARCOAL_SML = sqrt(PCHARS))

Mutate - example

awaruaCharcoal %>%
  mutate(totalCharcoal = PCHARS + PCHARL) 
##    depth_CM PCHARS PCHARL totalCharcoal
## 1         1      4      0             4
## 2         3      5      0             5
## 3         5     22      0            22
## 4         7     10      0            10
## 5         9     26      0            26
## 6        11     45      0            45
## 7        13     26      0            26
## 8        15     17      1            18
## 9        17      5      2             7
## 10       19      0      0             0

Mutate - example

# just calculate rowSums
rowSums(awarua)

# but we don't want to sum the charcoal and depth columns!
rowSums(awarua %>% select(-c(Depth..cm., PCHARL, PCHARS)))

# also we'd like it actually as a new column in the dataframe...
# so we take the code above, give the new col a name ("allRows")
awaruaWithTotals <- awarua %>%
 mutate(allRows = rowSums(awarua %>% select(-c(Depth..cm., PCHARL, PCHARS))))

Mutate - example

  • to transform everything to percents,
  • we use a special form of mutate
  • and we choose which columns to transform (here, by dropping the others with select() first)
awaruaVeg <- awaruaWithTotals %>%
  select(-c(PCHARL, PCHARS, allRows, Depth..cm.)) %>% 
  mutate_all(.funs = function(eachColumn) {100 * eachColumn / awaruaWithTotals$allRows})

head(awaruaVeg)
##     DACCUP   DACDAC    FUSTYP   PRUTAX    LOPMEN   PODOCA    PHYLLO HALOCA
## 1 24.00000 4.363636 0.7272727 1.454545 0.0000000 1.818182 0.7272727      0
## 2 15.30945 1.954397 0.9771987 3.908795 1.3029316 2.931596 4.2345277      0
## 3 24.91103 1.067616 0.3558719 2.846975 0.7117438 3.914591 6.4056940      0
## 4 25.70694 2.313625 1.2853470 4.627249 0.0000000 2.570694 4.6272494      0
## 5 24.29668 2.813299 2.0460358 2.046036 0.0000000 2.046036 7.4168798      0
## 6 23.96694 4.407713 1.1019284 4.958678 0.2754821 1.377410 7.4380165      0
##     ELAEOC    HOHERI    WEINMA    ASTERA   COPROS   MYRSIN    DRATYP
## 1 2.181818 0.0000000 20.363636 0.0000000 1.454545 3.272727 0.0000000
## 2 7.817590 0.3257329 21.498371 0.9771987 2.280130 5.211726 0.3257329
## 3 4.626335 0.0000000 12.455516 0.0000000 1.423488 7.829181 0.0000000
## 4 3.598972 0.0000000  9.511568 0.7712082 2.827763 5.912596 0.0000000
## 5 2.301790 0.0000000 12.787724 0.5115090 2.046036 4.347826 0.2557545
## 6 3.305785 0.0000000  4.683196 0.0000000 4.958678 5.785124 0.0000000
##      GAULTH     RUBUS    MUEHLE    PENNAN    PSEUDO    ARALIA    POACEA
## 1 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.3636364 26.909091
## 2 0.3257329 0.3257329 0.3257329 0.0000000 1.3029316 1.6286645  8.794788
## 3 0.0000000 0.0000000 0.0000000 1.0676157 0.3558719 0.0000000  4.626335
## 4 0.5141388 0.0000000 0.0000000 0.0000000 0.2570694 0.0000000  3.598972
## 5 0.0000000 0.2557545 0.0000000 0.0000000 0.0000000 1.0230179  2.301790
## 6 0.0000000 0.0000000 0.5509642 0.2754821 0.2754821 0.2754821  1.101928
##      LEPTYP GONTYP    EMPTYP    GLEICH    MYRTYP    PHORMI    RANTYP
## 1  4.363636      0 0.0000000 0.3636364  0.000000 0.0000000 0.0000000
## 2  8.143322      0 0.0000000 0.0000000  3.257329 0.6514658 0.6514658
## 3 16.725979      0 0.0000000 0.3558719  4.626335 0.0000000 0.0000000
## 4 14.138817      0 0.2570694 0.0000000 11.311054 0.2570694 0.0000000
## 5 20.971867      0 0.2557545 1.0230179  7.672634 0.2557545 0.0000000
## 6 22.589532      0 0.0000000 0.8264463  7.162534 0.0000000 0.2754821
##   SPHAGN GUNNER NERTER     CAREX    BAUTYP SCHTYP    PTEESC    CYDTYP
## 1      0      0      0 0.0000000 0.3636364      0 1.0909091 0.7272727
## 2      0      0      0 0.0000000 1.3029316      0 1.9543974 0.6514658
## 3      0      0      0 0.0000000 0.3558719      0 1.4234875 0.3558719
## 4      0      0      0 0.0000000 0.5141388      0 2.0565553 0.5141388
## 5      0      0      0 0.5115090 0.5115090      0 0.7672634 0.0000000
## 6      0      0      0 0.5509642 0.2754821      0 0.2754821 0.0000000
##      DICSQU     MONOL    HISINC    MICPUS    LYCLAT LYCFAS    PINACE
## 1 0.0000000 1.0909091 0.7272727 0.7272727 0.0000000      0 2.9090909
## 2 0.0000000 0.6514658 0.0000000 0.6514658 0.0000000      0 0.0000000
## 3 0.0000000 1.0676157 0.3558719 0.0000000 0.3558719      0 0.3558719
## 4 0.2570694 1.7994859 0.0000000 0.5141388 0.0000000      0 0.0000000
## 5 0.0000000 1.2787724 0.0000000 0.0000000 0.0000000      0 0.0000000
## 6 0.0000000 1.3774105 0.2754821 0.8264463 0.0000000      0 0.0000000
##      KIRTYP     RUMEX    BRYOPH NEOPTP PITTOS
## 1 0.0000000 0.0000000 0.0000000      0      0
## 2 0.3257329 0.0000000 0.0000000      0      0
## 3 1.0676157 0.0000000 0.3558719      0      0
## 4 0.2570694 0.0000000 0.0000000      0      0
## 5 0.0000000 0.2557545 0.0000000      0      0
## 6 0.0000000 0.8264463 0.0000000      0      0
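
If you have a newer dplyr than the 0.7.5 used here, the same idea can be written with across() (added in dplyr 1.0.0), which lets the depth column stay in the data frame throughout. This is a sketch under that assumption; counts and its taxonA, taxonB and taxonC columns are invented numbers, not the Awarua data.

```r
library(dplyr)   # across() needs dplyr >= 1.0.0

# invented counts for illustration only
counts <- data.frame(
  depth  = c(1, 3),
  taxonA = c(30, 10),
  taxonB = c(10, 30),
  taxonC = c(10, 10)
)

percents <- counts %>%
  mutate(total = rowSums(across(-depth))) %>%                  # row total of the taxon columns
  mutate(across(-c(depth, total), ~ 100 * .x / total)) %>%     # each taxon as a percent of its row total
  select(-total)

percents
rowSums(percents %>% select(-depth))   # a quick check: each row should add up to 100
```

Checking that the rows add up to 100 is a cheap way to catch a column that was summed when it should not have been (depth or charcoal, for example).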

Mutate example - cont

  • Now we put the Depths column back with the vegetation
awaruaVegDepths <- data.frame(Depths = awaruaWithTotals$Depth..cm.,
                              awaruaVeg)

head(awaruaVegDepths)
##   Depths   DACCUP   DACDAC    FUSTYP   PRUTAX    LOPMEN   PODOCA    PHYLLO
## 1      1 24.00000 4.363636 0.7272727 1.454545 0.0000000 1.818182 0.7272727
## 2      3 15.30945 1.954397 0.9771987 3.908795 1.3029316 2.931596 4.2345277
## 3      5 24.91103 1.067616 0.3558719 2.846975 0.7117438 3.914591 6.4056940
## 4      7 25.70694 2.313625 1.2853470 4.627249 0.0000000 2.570694 4.6272494
## 5      9 24.29668 2.813299 2.0460358 2.046036 0.0000000 2.046036 7.4168798
## 6     11 23.96694 4.407713 1.1019284 4.958678 0.2754821 1.377410 7.4380165
##   HALOCA   ELAEOC    HOHERI    WEINMA    ASTERA   COPROS   MYRSIN
## 1      0 2.181818 0.0000000 20.363636 0.0000000 1.454545 3.272727
## 2      0 7.817590 0.3257329 21.498371 0.9771987 2.280130 5.211726
## 3      0 4.626335 0.0000000 12.455516 0.0000000 1.423488 7.829181
## 4      0 3.598972 0.0000000  9.511568 0.7712082 2.827763 5.912596
## 5      0 2.301790 0.0000000 12.787724 0.5115090 2.046036 4.347826
## 6      0 3.305785 0.0000000  4.683196 0.0000000 4.958678 5.785124
##      DRATYP    GAULTH     RUBUS    MUEHLE    PENNAN    PSEUDO    ARALIA
## 1 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.3636364
## 2 0.3257329 0.3257329 0.3257329 0.3257329 0.0000000 1.3029316 1.6286645
## 3 0.0000000 0.0000000 0.0000000 0.0000000 1.0676157 0.3558719 0.0000000
## 4 0.0000000 0.5141388 0.0000000 0.0000000 0.0000000 0.2570694 0.0000000
## 5 0.2557545 0.0000000 0.2557545 0.0000000 0.0000000 0.0000000 1.0230179
## 6 0.0000000 0.0000000 0.0000000 0.5509642 0.2754821 0.2754821 0.2754821
##      POACEA    LEPTYP GONTYP    EMPTYP    GLEICH    MYRTYP    PHORMI
## 1 26.909091  4.363636      0 0.0000000 0.3636364  0.000000 0.0000000
## 2  8.794788  8.143322      0 0.0000000 0.0000000  3.257329 0.6514658
## 3  4.626335 16.725979      0 0.0000000 0.3558719  4.626335 0.0000000
## 4  3.598972 14.138817      0 0.2570694 0.0000000 11.311054 0.2570694
## 5  2.301790 20.971867      0 0.2557545 1.0230179  7.672634 0.2557545
## 6  1.101928 22.589532      0 0.0000000 0.8264463  7.162534 0.0000000
##      RANTYP SPHAGN GUNNER NERTER     CAREX    BAUTYP SCHTYP    PTEESC
## 1 0.0000000      0      0      0 0.0000000 0.3636364      0 1.0909091
## 2 0.6514658      0      0      0 0.0000000 1.3029316      0 1.9543974
## 3 0.0000000      0      0      0 0.0000000 0.3558719      0 1.4234875
## 4 0.0000000      0      0      0 0.0000000 0.5141388      0 2.0565553
## 5 0.0000000      0      0      0 0.5115090 0.5115090      0 0.7672634
## 6 0.2754821      0      0      0 0.5509642 0.2754821      0 0.2754821
##      CYDTYP    DICSQU     MONOL    HISINC    MICPUS    LYCLAT LYCFAS
## 1 0.7272727 0.0000000 1.0909091 0.7272727 0.7272727 0.0000000      0
## 2 0.6514658 0.0000000 0.6514658 0.0000000 0.6514658 0.0000000      0
## 3 0.3558719 0.0000000 1.0676157 0.3558719 0.0000000 0.3558719      0
## 4 0.5141388 0.2570694 1.7994859 0.0000000 0.5141388 0.0000000      0
## 5 0.0000000 0.0000000 1.2787724 0.0000000 0.0000000 0.0000000      0
## 6 0.0000000 0.0000000 1.3774105 0.2754821 0.8264463 0.0000000      0
##      PINACE    KIRTYP     RUMEX    BRYOPH NEOPTP PITTOS
## 1 2.9090909 0.0000000 0.0000000 0.0000000      0      0
## 2 0.0000000 0.3257329 0.0000000 0.0000000      0      0
## 3 0.3558719 1.0676157 0.0000000 0.3558719      0      0
## 4 0.0000000 0.2570694 0.0000000 0.0000000      0      0
## 5 0.0000000 0.0000000 0.2557545 0.0000000      0      0
## 6 0.0000000 0.0000000 0.8264463 0.0000000      0      0